In this paper we discuss the possibilities of using the Artificial Neural
Network (ANN) technique for the evaluation of individual Extensive Air Shower
(EAS) data. It is shown that recently developed computational methods can be
used in studies of EAS registered by very large and complex detector systems.
The ANN can be used to classify showers, e.g. by primary particle mass, as
well as to determine a particular EAS parameter, e.g. the total muon number.
Examples of both kinds of analysis are given and discussed.
1 Introduction



The use of Artificial Neural Networks (ANN) for solving very different
physical problems has become widespread and promising in recent years ([1]).
The complexity of the physical processes, together with the huge amount of
data to be processed, makes "classical" calculations so complicated that
the results can sometimes be reached only at the very edge of feasibility.
One of the best examples of such a situation is Extensive Air Shower (EAS)
physics.
It is possible, in principle, to describe the EAS using Monte-Carlo
techniques. The main processes governing the passage of particles through
the air are known to some extent. Some believe that the contemporary
knowledge of high energy hadronic collisions is good enough to lead us to
solutions of some important cosmic ray physics problems. Even if this is so,
the problem remains how to extract the physical answer from cosmic ray
experimental data. The only way possible at present is to compare the results
of Monte-Carlo calculations with the measurements. The most common result of
such a comparison is the complaint that the measured data are not in very
close agreement with the predictions of the assumed Monte-Carlo model. (Some
"physical" discussions concern the question "How far do they disagree?", but
this is not a physical question!) The main difficulty lies in the complexity
of the problem.
(The Monte-Carlo programs used are hundreds of thousands of lines long.
The first question is how much we can trust them. This is also not a
"physical" question, but we should bear it in mind.) The best, or I should
say the most popular, Monte-Carlo codes for simulating high energy hadronic
interactions contain two main sources of uncertainty. First, in the absence
of a theory of strong interactions from which to build models, more or less
strong physical assumptions about the interaction picture are needed. Some
simplification during modeling is natural, and it must also be treated as an
("unremovable") source of uncertainty. The other source is the values of the
model parameters, which cannot be taken from theory and have to be fitted to
some (mainly accelerator) experimental results. Contemporary Monte-Carlo
programs need about a hundred parameters (some of them very well known, some
not known at all). The question arises: how can cosmic ray physics be derived
from such impure "theoretical" predictions?

Let us look
at this problem from the other side. The very best existing experimental
facilities (and those which will be built in the near future) consist of a
large number of different detectors distributed over a wide (effective) area.
The data collected make it possible to study different characteristics of the
EAS. Each of them is "somehow connected" to a different "part" of the shower
development. The emphasis stresses a connection which certainly exists, but
which is unknown for the reasons given above. The question of major
importance in cosmic ray physics concerns the nature of primary cosmic rays:
their energy and mass spectra. Detailed knowledge of the connection between
the cosmic ray flux at the top of the atmosphere and the detector response in
the array at ground level would certainly be welcome, but the lack of it does
not make further study hopeless. Some general features observed in EAS at
ground level that are related to the mass of the primary CR particle are more
or less model independent. Calculations show that many of the shower
parameters depend on the mass, but all those dependencies are smeared by
fluctuations in individual shower development. On the other hand, these
fluctuations are also related to the nature of the primary particle.

To get the maximum information
on a shower registered by a complex and extended array there are at least
two general ways. The first is to obtain a set of parameters describing the
shower as completely as possible. This can be understood as a contraction of
the raw multidimensional experimental space (in which each dimension is given
by a single detector signal) to a space of some shower parameters with far
fewer dimensions. This reduction can be more or less fortunate, and there is
no unambiguous way to do it. Further analysis of the contracted space can be
done in a conventional way by comparing the experimental points with
Monte-Carlo simulated showers (Ref. [2]). For an experiment dedicated to
primary mass determination, the most desirable contraction reduces the whole
measurement space to a single dimension, which is then interpreted as the
primary mass response of the apparatus. Sometimes one also requires the
reduction procedure to be the most effective one, meaning that it should
minimize the interpretation errors. Different methods can be used to find
such a best data evaluation procedure. A well defined one is Principal
Component Analysis, discussed e.g. in Ref. [4]. The disadvantage of all these
procedures is that they rely very strongly on Monte-Carlo simulation
programs. Proving the exactness of the methods is very hard and always
depends on belief in the accuracy of the shower development description.
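The contraction of the detector-signal space by Principal Component Analysis
can be illustrated with a minimal sketch. It is not the procedure of Ref. [4];
the function name pca_reduce and the toy data (500 fake "showers" driven by
two hidden factors) are assumptions made only for illustration.

```python
import numpy as np


def pca_reduce(signals, n_components):
    """Project detector signals onto their leading principal components.

    signals has shape (n_showers, n_detectors).
    Returns (projected, components, explained_fraction).
    """
    centered = signals - signals.mean(axis=0)
    # Rows of vt are the principal directions, ordered by singular value.
    _, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
    components = vt[:n_components]
    projected = centered @ components.T
    variance = singular_values ** 2
    explained_fraction = variance[:n_components].sum() / variance.sum()
    return projected, components, explained_fraction


# Toy data (assumption): 192 detector signals driven by 2 hidden factors
# plus a small amount of noise.
rng = np.random.default_rng(0)
hidden = rng.normal(size=(500, 2))
mixing = rng.normal(size=(2, 192))
toy_signals = hidden @ mixing + 0.01 * rng.normal(size=(500, 192))

projected, components, explained = pca_reduce(toy_signals, n_components=2)
print(projected.shape, float(explained))
```

In this toy case two components carry almost all of the variance, because the
data were built that way. For real showers the number of useful components
would have to be judged from the simulated sample.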

The second way is the subject of this paper. The raw experimental data space
can be reduced directly using Artificial Neural Networks. The ANN can be
trained on Monte-Carlo simulated array responses for which the initial cosmic
ray particle is known. The reduction to the desired output dimensionality is
then achieved by construction, using the general rules established in the
Monte-Carlo sample. The question of the correctness of the ANN procedure is
more complicated, mainly because the theory of such a method is poorly known.
The nature of the network self-organizing process is an enigma, and the rules
developed by the network during training are far from the standard "physical"
ones. For very complicated networks they are even hard to extract. The proof
of correctness is much harder than for the known statistical methods, if it
is possible at all. But, as the saying goes, "the proof of the pudding is in
the eating", and I want to show that the method can be used satisfactorily in
some cases.

In the present work I want to show some preliminary results of the ANN
analysis, using as an example the muon part of the array of the KASCADE
experiment in Karlsruhe (Ref. [6]). The muon array consists of 192 detectors
of 3.2 m^2 each, spread over a 200 m x 200 m surface. The muon energy
threshold is 0.1 GeV.
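The numbers quoted above fix how sparsely the array samples the muon
component. The short calculation below uses only those numbers; the variable
names are mine.

```python
n_detectors = 192
detector_area_m2 = 3.2
array_side_m = 200.0

sensitive_area_m2 = n_detectors * detector_area_m2  # total detector surface
array_area_m2 = array_side_m ** 2                   # surface spanned by the array
coverage = sensitive_area_m2 / array_area_m2

print(f"sensitive area: {sensitive_area_m2:.1f} m^2")
print(f"coverage of the array surface: {coverage:.2%}")
```

The detectors cover roughly 1.5 percent of the array surface, so only a small
fraction of the muons falling inside the array is registered.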

I want to study to what extent the information from these counters can be
analyzed with the help of ANN techniques.



2 Using ANN for the determination of particular shower parameters

The minimization procedures usually used to estimate the parameters of a
distribution need two conditions to be fulfilled. One is that the measurement
of the shower be statistically accurate enough. The problem can be seen very
well in Fig. 1, where typical muon detector responses for a shower in an
array with the KASCADE geometry are presented.

In most showers the numbers of muons registered by the individual detectors
fluctuate around very few. Large statistical fluctuations of the measured
density are therefore expected.
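How small the individual detector counts are can be illustrated with a toy
simulation. Everything specific in it is an assumption and not taken from the
present analysis: a Greisen-type lateral distribution with a scale radius of
320 m, a regular 16 x 12 grid standing in for the real detector layout, and a
round total muon number of 10^4.

```python
import math

import numpy as np

R0_M = 320.0  # scale radius of the assumed Greisen-type function, in metres


def muon_density(r_m, n_mu):
    """Assumed muon density (muons per m^2) at core distance r_m."""
    norm = math.gamma(2.5) / (2.0 * math.pi * R0_M**2 * math.gamma(1.25) ** 2)
    x = np.maximum(r_m, 1.0) / R0_M  # clamp to avoid the singularity at r = 0
    return n_mu * norm * x**-0.75 * (1.0 + x) ** -2.5


def detector_grid(nx=16, ny=12, side_m=200.0):
    """Hypothetical regular grid of nx * ny detector centres."""
    xs = (np.arange(nx) + 0.5) * side_m / nx - side_m / 2
    ys = (np.arange(ny) + 0.5) * side_m / ny - side_m / 2
    gx, gy = np.meshgrid(xs, ys)
    return np.column_stack([gx.ravel(), gy.ravel()])


def simulate_counts(n_mu, core_xy, rng, area_m2=3.2):
    """Poisson muon counts in every detector for a vertical shower."""
    positions = detector_grid()
    r = np.hypot(positions[:, 0] - core_xy[0], positions[:, 1] - core_xy[1])
    expected = muon_density(r, n_mu) * area_m2
    return rng.poisson(expected), expected


rng = np.random.default_rng(0)
counts, expected = simulate_counts(n_mu=1.0e4, core_xy=(0.0, 0.0), rng=rng)
print(counts.reshape(12, 16))
print("largest expected count:", float(expected.max()))
```

With these assumed numbers the expected count stays of order one even close
to the core, so the Poisson fluctuations are comparable to the signal itself.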







Fig. 1. The muon detector response for a typical shower initiated by a primary
proton of energy 10^15 eV in an array with the KASCADE experiment geometry.

The other condition is that the assumed form of the distribution one is
trying to fit is correct. In the literature there are usually a few
possibilities, and which one to choose is a matter of taste. The differences
are not expected to be large, but the word "large" has no precise meaning.
The ANN method is, by construction, not troubled by either of these problems.
To show that, I want to present results on the determination of the total
muon number for individual showers in KASCADE-like experiments. It should
also be pointed out that the traditional minimization methods for determining
the muon lateral distribution need additional information about the Extensive
Air Shower, usually obtained from the measurement of the electromagnetic
component in another part of the experiment: the shower core position and the
inclination angles of the shower axis. For the results presented in this
section those shower parameters are not used. The estimation of the muon size
of the shower using the ANN is based only on the registration of the muon
component. A schematic view of the ANN architecture is given in Fig. 2.

The input consists of 192 signals from ideal detectors measuring the numbers
of muons passing through the detector surface. Each input is connected to
each neuron of the first hidden level. The analysis was performed with
networks with two hidden levels and different numbers of neurons, to see the
effect of the network size. The last hidden level is connected to a single
output unit. The number of network parameters to be trained ranged from about
25000 to 500000. As the response function the common sigmoid function

$$\mathrm{output} = \frac{1}{1 + e^{-\mathrm{input}}} \qquad (1)$$

was used, and the training used the standard back-propagation method. To
reach the results, hundreds of thousands of showers were used in the training
procedure. The direct use of a Monte-Carlo simulation program for training
was simply impossible, so a special pseudo-Monte-Carlo generator was
developed, based on a semi-empirical description of CORSIKA v4.112 (Ref. [5])
showers. From the point of view of total muon number estimation the details
are not very important. For the same reason only vertical showers are used in
the present analysis. Results for inclined showers do not differ much; an
example is given in Fig. 3. It should be said that for the final ANN tests
exact CORSIKA output showers were used. The results presented below thus
justify the accuracy of our pseudo-Monte-Carlo shower generator.

Fig. 2. Schematic layout of the Artificial Neural Network used for the
evaluation of the total number of muons in individual EAS.
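The architecture described in this section (192 inputs, two hidden levels, one
output, sigmoid response, back-propagation) can be written down compactly.
The sketch below is a generic textbook version and not the code used for the
results. The hidden level size of 100, the squared-error loss, the weight
initialization and the need to rescale the target into the interval (0, 1)
are all assumptions.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def count_parameters(layer_sizes):
    """Number of weights plus biases in a fully connected network."""
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]))


def init_network(layer_sizes, rng):
    """One (weights, biases) pair per pair of consecutive levels."""
    return [
        (rng.normal(0.0, 1.0 / np.sqrt(n_in), size=(n_in, n_out)),
         np.zeros(n_out))
        for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])
    ]


def forward(params, x):
    """Return the activations of every level, the input included."""
    activations = [x]
    for weights, biases in params:
        activations.append(sigmoid(activations[-1] @ weights + biases))
    return activations


def backprop_step(params, x, target, learning_rate):
    """One gradient-descent step on the squared error; returns the loss."""
    activations = forward(params, x)
    output = activations[-1]
    loss = 0.5 * np.mean(np.sum((output - target) ** 2, axis=1))
    # delta is the derivative of the loss with respect to the summed input
    # of the current level, averaged over the batch.
    delta = (output - target) * output * (1.0 - output) / x.shape[0]
    for level in reversed(range(len(params))):
        weights, biases = params[level]
        grad_w = activations[level].T @ delta
        grad_b = delta.sum(axis=0)
        if level > 0:
            a = activations[level]
            # Propagate with the weights from before this update.
            delta = (delta @ weights.T) * a * (1.0 - a)
        params[level] = (weights - learning_rate * grad_w,
                         biases - learning_rate * grad_b)
    return loss


layer_sizes = [192, 100, 100, 1]  # hidden sizes are illustrative only
print("trainable parameters:", count_parameters(layer_sizes))
```

With two hidden levels of 100 neurons this gives 29501 parameters, near the
lower end of the range quoted above. Because the output neuron is a sigmoid,
a quantity such as the total muon number would have to be mapped into (0, 1),
for example through a scaled logarithm, before it can serve as a target.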

In Fig. 3 the convergence of the training procedure is shown for different
network sizes. The vertical axis gives the width of the distribution of the
deviation of the estimated total muon number from its true value.
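A width of this kind can be computed as sketched below. The text does not say
whether the deviation is taken as an absolute, relative or logarithmic
difference, so the relative deviation used here is an assumption, as is the
function name.

```python
import numpy as np


def deviation_width(estimated, true):
    """Mean and standard deviation of (estimated - true) / true."""
    estimated = np.asarray(estimated, dtype=float)
    true = np.asarray(true, dtype=float)
    relative = (estimated - true) / true
    return relative.mean(), relative.std()


# Two showers with the same true muon number, estimated 10 percent too high
# and 10 percent too low.
bias, width = deviation_width([110.0, 90.0], [100.0, 100.0])
print(bias, width)
```

The mean of the distribution measures the bias of the estimate and the
standard deviation its width, which is the quantity followed during training.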

As can be seen, further enlargement of the network does not improve the
accuracy of the ANN answers. It should be noted that the ANN also works quite
well for inclined showers. The fluctuations in the single detectors increase
for inclined showers because the effective area of the detector is reduced by
the factor cos(Θ).

Fig. 3. The efficiency of the ANN as a function of the number of events used
in the training process. The solid lines show the results for different
numbers of neurons in the two hidden levels. The dashed line is the result
for inclined showers.

The obtained results are presented in Figs. 4-6. 

The accuracy achieved in the muon size determination is quite good. An
interesting point is that the network trained with proton showers only also
gives answers in tests with iron induced showers, as can be seen in Fig. 5.
A bias toward smaller values is seen, which is clearly the result of the
difference in shape between the proton and iron muon lateral distributions.
That bias disappears when the training sample also contains heavy primaries
in the primary particle spectrum.

Another important feature of the ANN method is presented in Fig. 6. The ANN
is able to give an answer even when only very few detectors are hit. This is
shown in Fig. 6a.

The bias seen in Fig. 6a is expected, because a minimum threshold of 4 hit
detectors was required in the training shower sample.

A comparison of the ANN results with a method based on standard minimization
techniques is given in Ref. [3]. It is shown there that the spread of the
estimated total muon number with respect to the true one obtained with the
ANN procedure is about the same as for the best minimization methods for
relatively large showers, where the statistical weight of the information
collected by the detectors is high enough, and is much smaller for small and
very small showers.

Fig. 4. The accuracy of determining the total muon number in individual
showers, shown for different shower sizes labeled by the primary particle
energy a) and by the number of hit detectors in the KASCADE-like array b).

Fig. 5. The accuracy of determining the total muon number in individual
showers initiated by primary protons a) and iron nuclei b) of energy
10^15 eV. The histograms for different numbers of hit detectors are presented
separately as in Fig. 4b.

Fig. 6. The same as in Fig. 4 for very small a) and relatively large b)
proton showers (of energies 10^14 and 10^16 eV respectively).



3 Using ANN for the determination of primary particle mass 



It is clear that the primary cosmic ray particle mass influences the air
shower development. The possible mass spectrum consists in principle of all
the stable nuclides (in practice from hydrogen to iron). However, the
fluctuations of the cascading process make it impossible to distinguish very
close masses, and because of that the primary cosmic ray spectrum is often
studied as a spectrum of groups of similar nuclei. The whole mass spectrum is
divided into five groups: H, He, light (C-N-O), heavy (around Si) and very
heavy (Fe). The determination of the relative abundances of these groups is
known as the question of the cosmic ray primary mass spectrum. For this task
the network architecture was changed. Two different possibilities have been
examined. In the first, preprocessed data were used as inputs. Instead of the
direct muon detector responses, a few parameters of the muon lateral
distribution were used. The 192 input nodes carrying information about the
muons were reduced to only four: the muon densities at two well measured
distances (50 and 100 m) and the relative slopes of the muon lateral
distribution at these distances. The
important information about the primary mass composition is also included in
the electromagnetic shower component, so another four input nodes were
introduced, carrying the same information as for the muons but for the
electron lateral distribution. To increase the number of connections within
the network, an additional hidden level was introduced. In the output level,
instead of one neuron, five neurons were set up, each related to one group of
nuclei in the primary cosmic ray mass spectrum. The ANN answer was taken to
be the index of the neuron with the highest output signal.

Fig. 7. Schematic layout of the Artificial Neural Network used for the primary
particle mass determination using the preprocessed muon data. The diagram
labels are: electron detector information, muon detector information, output
neurons.
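The decision rule, taking the output neuron with the highest signal, amounts
to an argmax over the five outputs. A minimal sketch follows. The short group
labels and the example output values are chosen here for illustration.

```python
import numpy as np

# Short labels for the five mass groups named in the text.
MASS_GROUPS = ("H", "He", "CNO", "Si", "Fe")


def classify(outputs):
    """Return the mass-group label of the strongest output neuron.

    outputs holds five values per shower, either as one flat list for a
    single shower or as an array of shape (n_showers, 5).
    """
    outputs = np.asarray(outputs, dtype=float).reshape(-1, len(MASS_GROUPS))
    return [MASS_GROUPS[i] for i in np.argmax(outputs, axis=1)]


print(classify([0.1, 0.2, 0.6, 0.3, 0.1]))
```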

The structure of the network is shown in Fig. 7. 

First the behaviour of the network was tested for different pairs of the five
possible masses in the primary spectrum. This was done to see whether the ANN
is able to distinguish between events as close as those produced, for
example, by H (A=1) and He (A=4). The physical fluctuations in the shower
development could wash out the information about the primary mass. It was
concluded that the efficiency of the method in such cases is rather
questionable. However, for the outermost masses in the primary mass spectrum,
H and Fe, the separation is almost perfect, as shown in Fig. 8. The network
efficiency is defined as the conditional probability that the ANN output is
right for each particular primary particle mass.

Fig. 8. The results of separation of H and Fe by ANN using preprocessed muon
data. The efficiency of the ANN as a function of the number of showers used
for the network training (left) and the final separation results (right).
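The efficiency defined above, the probability of a right answer given the true
primary mass, is the diagonal of the confusion matrix divided by its row sums.
The sketch below assumes that the mass groups are encoded as integer indices;
the function name and the small example are mine.

```python
import numpy as np


def class_efficiencies(true_labels, predicted_labels, n_classes):
    """Confusion matrix (rows: true class) and per-class efficiencies."""
    true_labels = np.asarray(true_labels, dtype=int)
    predicted_labels = np.asarray(predicted_labels, dtype=int)
    confusion = np.zeros((n_classes, n_classes), dtype=int)
    # Count every (true, predicted) pair, repeated pairs included.
    np.add.at(confusion, (true_labels, predicted_labels), 1)
    totals = confusion.sum(axis=1)
    efficiencies = np.divide(
        np.diag(confusion), totals,
        out=np.full(n_classes, np.nan), where=totals > 0,
    )
    return confusion, efficiencies


# Toy example with two classes (0 = H, 1 = Fe).
confusion, efficiencies = class_efficiencies(
    [0, 0, 1, 1, 1], [0, 1, 1, 1, 0], n_classes=2
)
print(confusion)
print(efficiencies)
```

Classes that never occur in the sample get an efficiency of NaN rather than
zero, so that an empty class is not mistaken for a badly recognized one.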

The difficulties with separating close masses make it necessary to modify the
training procedure for networks trained with the five component mass
spectrum. Requiring that the ANN answer be exactly the true primary particle
mass known from the simulation makes the training process unsuccessful. This
requirement has therefore been replaced by a broader target output with its
maximum at the true value, in which the signals of neighbouring output
neurons are assumed to be higher than those of more distant ones. With this
modification the ANN was trained with the full five component spectrum and
convergence was found. The results are given in Fig. 9.
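A broadened target of this kind can be built in many ways. The exact shape
used in the training is not specified here, so the Gaussian profile and its
width in the sketch below are assumptions chosen only to satisfy the stated
properties: maximum at the true group, neighbours higher than distant groups.

```python
import numpy as np


def soft_target(true_index, n_classes=5, width=1.0):
    """Target vector peaked at true_index and falling off with distance."""
    index = np.arange(n_classes)
    return np.exp(-0.5 * ((index - true_index) / width) ** 2)


# Target for a shower whose true primary belongs to the first group (H).
print(soft_target(0))
# Target for the middle group (CNO): symmetric around index 2.
print(soft_target(2))
```

A smaller width approaches the exact one-of-five target that made the
training fail, while a larger width blurs the distinction between groups.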

One more interesting ANN architecture was examined. The raw outputs of all
192 muon detectors were used as input signals, together with the preprocessed
electron data as before. The statistical weight of the information about
muons in each shower is reduced by the fact that the fraction of muons
registered by all detectors in the KASCADE-like geometry is about a percent
of all muons in the shower. The results presented previously were obtained
using characteristics derived from all the muons. The physically important
question is whether the ANN method can be used directly with raw experimental
data.

First the possibility of H - Fe separation was tested again. The results are
presented in Fig. 10.
Fig. 9. The results of the ANN method using preprocessed muon data for the
network trained with all five components. The efficiency of the ANN as a
function of the number of showers used for the network training (left) and
the final separation results (right).

Fig. 10. The results of separation of H and Fe by ANN using raw muon detector
data. The efficiency of the ANN as a function of the number of showers used
for the network training (a) and the final separation results (b).

It is surprising that, contrary to what one might expect, the final
efficiency achieved is not much worse than that obtained previously (Fig. 8).
The only thing that has changed is the training time. The network needs many
more simulated showers to reach the final resolution. This is partially due
to the increase in the number of neurons in the net.

The next and final step of the present analysis is to see whether the raw
muon detector data from the KASCADE-like experiment allow one to distinguish
between different components in the whole primary mass spectrum. The results
are given in Fig. 11.

As can be seen, the efficiencies are worse than those obtained previously for
the "all muon data" (Fig. 9). However, it is clear that the ANN is able to
give some information about the primary particle mass.

The comparison of the efficiencies given in Figs. 9 and 11 shows that the ANN
approach is very effective, and that the information collected by the
detectors in a KASCADE-like geometry is rich enough to distinguish between
showers initiated by primary protons and iron nuclei using only the muons
(and electrons) registered by the array detectors.

Fig. 11. The results of the primary mass separation by ANN using raw muon
detector data. The efficiency of the ANN as a function of the number of
showers used for the network training (a) and the final separation results
(b).



4 Summary 

The results presented in this paper show that the ANN method can be used to
find the total muon number in extensive air showers as well as to classify
showers by primary particle mass.

The comparison of the total muon number obtained using the ANN with that from
a method based on standard minimization techniques shows that the ANN
procedure works as well as, and sometimes (for small and very small showers)
even better than, the best minimization method.

The discrimination between different primary CR particle masses looks very
promising. Further careful study of how to improve the ANN sensitivity, as
well as of the accuracy of the EAS Monte-Carlo generator, is needed to create
a powerful tool for data analysis in cosmic ray experiments.